Goto

Collaborating Authors

 geometric distortion




MaXsive: High-Capacity and Robust Training-Free Generative Image Watermarking in Diffusion Models

arXiv.org Artificial Intelligence

The great success of the diffusion model in image synthesis led to the release of gigantic commercial models, raising the issue of copyright protection and inappropriate content generation. Training-free diffusion watermarking provides a low-cost solution for these issues. However, the prior works remain vulnerable to rotation, scaling, and translation (RST) attacks. Although some methods employ meticulously designed patterns to mitigate this issue, they often reduce watermark capacity, which can result in identity (ID) collusion. To address these problems, we propose MaXsive, a training-free diffusion model generative watermarking technique that has high capacity and robustness. MaXsive best utilizes the initial noise to watermark the diffusion model. Moreover, instead of using a meticulously repetitive ring pattern, we propose injecting the X-shape template to recover the RST distortions. This design significantly increases robustness without losing any capacity, making ID collusion less likely to happen. The effectiveness of MaXsive has been verified on two well-known watermarking benchmarks under the scenarios of verification and identification.


Inspecting the Representation Manifold of Differentially-Private Text

arXiv.org Artificial Intelligence

Differential Privacy (DP) for text has recently taken the form of text paraphrasing using language models and temperature sampling to better balance privacy and utility. However, the geometric distortion of DP regarding the structure and complexity in the representation space remains unexplored. By estimating the intrinsic dimension of paraphrased text across varying privacy budgets, we find that word-level methods severely raise the representation manifold, while sentence-level methods produce paraphrases whose manifolds are topologically more consistent with human-written paraphrases. Among sentence-level methods, masked paraphrasing, compared to causal paraphrasing, demonstrates superior preservation of structural complexity, suggesting that autoregressive generation propagates distortions from unnatural word choices that cascade and inflate the representation space.


A Geometric Perspective for High-Dimensional Multiplex Graphs

arXiv.org Artificial Intelligence

High-dimensional multiplex graphs are characterized by their high number of complementary and divergent dimensions. The existence of multiple hierarchical latent relations between the graph dimensions poses significant challenges to embedding methods. In particular, the geometric distortions that might occur in the representational space have been overlooked in the literature. This work studies the problem of high-dimensional multiplex graph embedding from a geometric perspective. We find that the node representations reside on highly curved manifolds, thus rendering their exploitation more challenging for downstream tasks. Moreover, our study reveals that increasing the number of graph dimensions can cause further distortions to the highly curved manifolds. To address this problem, we propose a novel multiplex graph embedding method that harnesses hierarchical dimension embedding and Hyperbolic Graph Neural Networks. The proposed approach hierarchically extracts hyperbolic node representations that reside on Riemannian manifolds while gradually learning fewer and more expressive latent dimensions of the multiplex graph. Experimental results on real-world high-dimensional multiplex graphs show that the synergy between hierarchical and hyperbolic embeddings incurs much fewer geometric distortions and brings notable improvements over state-of-the-art approaches on downstream tasks.


Deformation-Invariant Neural Network and Its Applications in Distorted Image Restoration and Analysis

arXiv.org Artificial Intelligence

Images degraded by geometric distortions pose a significant challenge to imaging and computer vision tasks such as object recognition. Deep learning-based imaging models usually fail to give accurate performance for geometrically distorted images. In this paper, we propose the deformation-invariant neural network (DINN), a framework to address the problem of imaging tasks for geometrically distorted images. The DINN outputs consistent latent features for images that are geometrically distorted but represent the same underlying object or scene. The idea of DINN is to incorporate a simple component, called the quasiconformal transformer network (QCTN), into other existing deep networks for imaging tasks. The QCTN is a deep neural network that outputs a quasiconformal map, which can be used to transform a geometrically distorted image into an improved version that is closer to the distribution of natural or good images. It first outputs a Beltrami coefficient, which measures the quasiconformality of the output deformation map. By controlling the Beltrami coefficient, the local geometric distortion under the quasiconformal mapping can be controlled. The QCTN is lightweight and simple, which can be readily integrated into other existing deep neural networks to enhance their performance. Leveraging our framework, we have developed an image classification network that achieves accurate classification of distorted images. Our proposed framework has been applied to restore geometrically distorted images by atmospheric turbulence and water turbulence. DINN outperforms existing GAN-based restoration methods under these scenarios, demonstrating the effectiveness of the proposed framework. Additionally, we apply our proposed framework to the 1-1 verification of human face images under atmospheric turbulence and achieve satisfactory performance, further demonstrating the efficacy of our approach. Deep learning methods have made significant strides in the field of imaging and computer vision, allowing us to achieve remarkable results in tasks like image restoration, object recognition, and classification. However, when it comes to degraded images, deep learning methods can face significant challenges.


Bi-Mapper: Holistic BEV Semantic Mapping for Autonomous Driving

arXiv.org Artificial Intelligence

--A semantic map of the road scene, covering fundamental road elements, is an essential ingredient in autonomous driving systems. It provides important perception foundations for positioning and planning when rendered in the Bird's-Eye-View (BEV). Currently, the prior knowledge of hypothetical depth can guide the learning of translating front perspective views into BEV directly with the help of calibration parameters. However, it suffers from geometric distortions in the representation of distant objects. In addition, another stream of methods without prior knowledge can learn the transformation between front perspective views and BEV implicitly with a global view. Considering that the fusion of different learning methods may bring surprising beneficial effects, we propose a Bi-Mapper framework for top-down road-scene semantic understanding, which incorporates a global view and local prior knowledge. T o enhance reliable interaction between them, an asynchronous mutual learning strategy is proposed. At the same time, an Across-Space Loss (ASL) is designed to mitigate the negative impact of geometric distortions. Extensive results on nuScenes and Cam2BEV datasets verify the consistent effectiveness of each module in the proposed Bi-Mapper framework. Compared with exiting road mapping networks, the proposed Bi-Mapper achieves 2 . Moreover, we verify the generalization performance of Bi-Mapper in a real-world driving scenario. The source code is publicly available at BiMapper. N autonomous driving systems, a semantic map is an important basic element, which affects the downstream working, including location and planning. Recently, the Bird' s-Eye-View (BEV) map has shown an outstanding performance [1].


QC-SPHRAM: Quasi-conformal Spherical Harmonics Based Geometric Distortions on Hippocampal Surfaces for Early Detection of the Alzheimer's Disease

arXiv.org Machine Learning

We propose a disease classification model, called the QC-SPHARM, for the early detection of the Alzheimer's Disease (AD). The proposed QC-SPHARM can distinguish between normal control (NC) subjects and AD patients, as well as between amnestic mild cognitive impairment (aMCI) patients having high possibility progressing into AD and those who do not. Using the spherical harmonics (SPHARM) based registration, hippocampal surfaces segmented from the ADNI data are individually registered to a template surface constructed from the NC subjects using SPHARM. Local geometric distortions of the deformation from the template surface to each subject are quantified in terms of conformality distortions and curvatures distortions. The measurements are combined with the spherical harmonics coefficients and the total volume change of the subject from the template. Afterwards, a t-test based feature selection method incorporating the bagging strategy is applied to extract those local regions having high discriminating power of the two classes. The disease diagnosis machine can therefore be built using the data under the Support Vector Machine (SVM) setting. Using 110 NC subjects and 110 AD patients from the ADNI database, the proposed algorithm achieves 85:2% testing accuracy on 80 random samples as testing subjects, with the incorporation of surface geometry in the classification machine. Using 20 aMCI patients who has advanced to AD during a two-year period and another 20 aMCI patients who remain non-AD for the next two years, the algorithm achieves 81:2% accuracy using 10 randomly picked subjects as testing data. Our proposed method is 6%-15% better than other classification models without the incorporation of surface geometry. The results demonstrate the advantages of using local geometric distortions as the discriminating criterion for early AD diagnosis.